
    The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing

    We describe a statistical framework for QTL mapping using bulk segregant analysis (BSA) based on high-throughput, short-read sequencing. Our proposed approach is based on a smoothed version of the standard statistic, and takes into account variation in allele frequency estimates due to sampling of segregants to form bulks as well as variation introduced during the sequencing of bulks. Using simulation, we explore the impact of key experimental variables such as bulk size and sequencing coverage on the ability to detect QTLs. Counterintuitively, we find that relatively large bulks maximize the power to detect QTLs even though this implies weaker selection and less extreme allele frequency differences. Our simulation studies suggest that with large bulks and sufficient sequencing depth, the methods we propose can be used to detect even weak-effect QTLs, and we demonstrate the utility of this framework by application to a BSA experiment in the budding yeast Saccharomyces cerevisiae.

    An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding

    Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGPS is based on a generalized Expectation Maximization framework that shares information across multiple experiments for binding event discovery. We demonstrate that our framework enables the simultaneous modeling of sparse condition-specific binding changes, sequence dependence, and replicate-specific noise sources. MultiGPS encourages consistency in reported binding event locations across multiple-condition ChIP-seq datasets and provides accurate estimation of ChIP enrichment levels at each event. MultiGPS's multi-experiment modeling approach thus provides a reliable platform for detecting differential binding enrichment across experimental conditions. We demonstrate the advantages of MultiGPS with an analysis of Cdx2 binding in three distinct developmental contexts. By accurately characterizing condition-specific Cdx2 binding, MultiGPS enables novel insight into the mechanistic basis of Cdx2 site selectivity. Specifically, the condition-specific Cdx2 sites characterized by MultiGPS are highly associated with pre-existing genomic context, suggesting that such sites are pre-determined by cell-specific regulatory architecture. However, MultiGPS-defined condition-independent sites are not predicted by pre-existing regulatory signals, suggesting that Cdx2 can bind to a subset of locations regardless of genomic environment. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2–5. Funding: National Science Foundation (U.S.) (Graduate Research Fellowship under Grant 0645960); National Institutes of Health (U.S.) (grant P01 NS055923); Pennsylvania State University, Center for Eukaryotic Gene Regulation.

    Socio-economic drivers of specialist anglers targeting the non-native European catfish (Silurus glanis) in the UK.

    Information about the socioeconomic drivers of Silurus glanis anglers in the UK was collected using questionnaires from a cross section of mixed cyprinid fisheries to elucidate human dimensions in angling and non-native fisheries management. Respondents were predominantly male (95%), 30-40 years of age with £500 per annum. The proportion of time spent angling for S. glanis was significantly related to angler motivations; fish size, challenge in catch, tranquil natural surroundings, escape from daily stress and being alone were considered important drivers of increased time spent angling. Overall, poor awareness of the risks and adverse ecological impacts associated with introduced S. glanis, of non-native fisheries legislation, and of problems in the use of unlimited ground bait and high fish stocking rates in angling lakes was evident, possibly related to inadequate training and information provided by angling organisations to anglers, as many stated that they were insufficiently informed.

    RNAcontext: A New Method for Learning the Sequence and Structure Binding Preferences of RNA-Binding Proteins

    Metazoan genomes encode hundreds of RNA-binding proteins (RBPs). These proteins regulate post-transcriptional gene expression and have critical roles in numerous cellular processes including mRNA splicing, export, stability and translation. Despite their ubiquity and importance, the binding preferences for most RBPs are not well characterized. In vitro and in vivo studies, using affinity selection-based approaches, have successfully identified RNA sequences associated with specific RBPs; however, it is difficult to infer RBP sequence and structural preferences without specifically designed motif-finding methods. In this study, we introduce a new motif-finding method, RNAcontext, designed to elucidate RBP-specific sequence and structural preferences with greater accuracy than existing approaches. We evaluated RNAcontext on recently published in vitro and in vivo RNA affinity-selected data and demonstrate that RNAcontext identifies known binding preferences for several control proteins including HuR, PTB, and Vts1p and predicts new RNA structure preferences for SF2/ASF, RBM4, FUSIP1 and SLM2. The predicted preferences for SF2/ASF are consistent with its recently reported in vivo binding sites. RNAcontext is an accurate and efficient motif-finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures.

    Quantitative Models of the Mechanisms That Control Genome-Wide Patterns of Transcription Factor Binding during Early Drosophila Development

    Transcription factors that drive complex patterns of gene expression during animal development bind to thousands of genomic regions, with quantitative differences in binding across bound regions mediating their activity. While we now have tools to characterize the DNA affinities of these proteins and to precisely measure their genome-wide distribution in vivo, our understanding of the forces that determine where, when, and to what extent they bind remains primitive. Here we use a thermodynamic model of transcription factor binding to evaluate the contribution of different biophysical forces to the binding of five regulators of early embryonic anterior-posterior patterning in Drosophila melanogaster. Predictions based on DNA sequence and in vitro protein-DNA affinities alone achieve a correlation of ∼0.4 with experimental measurements of in vivo binding. Incorporating cooperativity and competition among the five factors, and accounting for spatial patterning by modeling binding in every nucleus independently, had little effect on prediction accuracy. A major source of error was the prediction of binding events that do not occur in vivo, which we hypothesized reflected reduced accessibility of chromatin. To test this, we incorporated experimental measurements of genome-wide DNA accessibility into our model, effectively restricting predicted binding to regions of open chromatin. This dramatically improved our predictions to a correlation of 0.6–0.9 for various factors across known target genes. Finally, we used our model to quantify the roles of DNA sequence, accessibility, and binding competition and cooperativity. Our results show that, in regions of open chromatin, binding can be predicted almost exclusively by the sequence specificity of individual factors, with a minimal role for protein interactions. We suggest that a combination of experimentally determined chromatin accessibility data and simple computational models of transcription factor binding may be used to predict the binding landscape of any animal transcription factor with significant precision.

    Why Should We Preserve Fishless High Mountain Lakes?

    High mountain lakes are originally fishless, although many have had introductions of non-native fish species, predominantly trout, and recently also minnows introduced by fishermen who use them as live bait. These introductions are widespread and substantial, often involving many lakes across mountain ranges. Predation on native fauna by introduced fish involves profound ecological changes, since fish occupy a higher trophic level that was previously absent. Fish predation produces a drastic reduction or elimination of autochthonous animal groups, such as amphibians and large macroinvertebrates in the littoral, and crustaceans in the plankton. These strong effects raise concerns for the conservation of high mountain lakes. In terms of individual species, those adapted to live in larger lakes have suffered a greater decrease in the size of their metapopulation. This ecological problem is discussed from a European perspective, providing examples from two study areas: the Pyrenees and the Western Italian Alps. Species-specific studies are urgently needed to evaluate the conservation status of the more impacted species, together with conservation measures at continental and regional scales, through regulation, and at local scale, through restoration actions, aimed at stopping further invasive species expansions and restoring the present situation. In various high mountain areas of the world, there have been restoration projects aiming to return lakes to their native fish-free status. In these areas, autochthonous species that disappeared with the introduction of fish are progressively recovering their initial distribution when nearby fish-free lakes and ponds are available.